
Assessing Opportunities of SYCL for Biological Sequence Alignment on GPU-based Systems (2211.10769v4)

Published 19 Nov 2022 in cs.PL and cs.DC

Abstract: Bioinformatics and Computational Biology are two fields that have been exploiting GPUs for more than two decades, with CUDA being the most widely used programming language for them. However, as CUDA is an NVIDIA proprietary language, it imposes a strong portability restriction across a wide range of heterogeneous architectures, such as AMD or Intel GPUs. To address this issue, the Khronos Group has recently proposed the SYCL standard, an open, royalty-free, cross-platform abstraction layer that enables heterogeneous systems to be programmed using standard, single-source C++ code. Over the past few years, several implementations of the SYCL standard have emerged, with oneAPI being the one from Intel. This paper presents the migration process of the SW# suite, a biological sequence alignment tool developed in CUDA, to SYCL using Intel's oneAPI ecosystem. The experimental results show that SW# was completely migrated with little programmer intervention in terms of hand-coding. In addition, it was possible to port the migrated code between different architectures (considering multiple vendor GPUs and also CPUs), with no noticeable performance degradation on 5 different NVIDIA GPUs. Moreover, performance remained stable when switching to another SYCL implementation. As a consequence, SYCL and its implementations can offer attractive opportunities for the Bioinformatics community, especially considering the large body of CUDA-based legacy code.
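As a rough illustration of the kind of computation that sequence alignment tools such as SW# accelerate, the sketch below computes a Smith-Waterman local alignment score on the CPU in standard C++. It is not taken from SW# and uses neither CUDA nor SYCL. The function name `smith_waterman_score`, the linear gap model, and the default scoring values are assumptions chosen only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Best local alignment score between sequences a and b.
// Assumed scoring (illustrative only): match = +2, mismatch = -1,
// linear gap penalty of 1 subtracted per gap position.
int smith_waterman_score(const std::string& a, const std::string& b,
                         int match = 2, int mismatch = -1, int gap = 1) {
    // Only two rows of the dynamic-programming matrix are kept in memory.
    std::vector<int> prev(b.size() + 1, 0);
    std::vector<int> curr(b.size() + 1, 0);
    int best = 0;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = 0;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            const int up = prev[j] - gap;        // gap in b
            const int left = curr[j - 1] - gap;  // gap in a
            // Clamping at 0 is what makes the alignment local.
            curr[j] = std::max({0, diag, up, left});
            best = std::max(best, curr[j]);
        }
        std::swap(prev, curr);
    }
    return best;
}
```

With these assumed parameters, `smith_waterman_score("ACGT", "ACT")` evaluates to 5: three matches and one gap. Each cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal can be computed independently, which is the property GPU implementations typically exploit.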

ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ohue, M., Shimoda, T., Suzuki, S., Matsuzaki, Y., Ishida, T., Akiyama, Y.: Megadock 4.0: an ultra–high-performance protein–protein docking software for heterogeneous supercomputers. Bioinformatics 30(22), 3281–3283 (2014) Loukatou et al. [2014] Loukatou, S., Papageorgiou, L., Fakourelis, P., Filntisi, A., Polychronidou, E., Bassis, I., Megalooikonomou, V., Makałowski, W., Vlachakis, D., Kossida, S.: Molecular dynamics simulations through gpu video games technologies. Journal of molecular biochemistry 3(2), 64 (2014) Mrozek et al. [2014] Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. 
In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Loukatou, S., Papageorgiou, L., Fakourelis, P., Filntisi, A., Polychronidou, E., Bassis, I., Megalooikonomou, V., Makałowski, W., Vlachakis, D., Kossida, S.: Molecular dynamics simulations through gpu video games technologies. Journal of molecular biochemistry 3(2), 64 (2014) Mrozek et al. 
[2014] Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. 
Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. 
Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. 
arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 
2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. 
https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) 
An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. 
[2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. 
Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  2. Nobile, M.S., Cazzaniga, P., Tangherloni, A., Besozzi, D.: Graphics processing units in bioinformatics, computational biology and systems biology. Briefings in Bioinformatics 18(5), 870–885 (2016) https://doi.org/10.1093/bib/bbw058
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ohue, M., Shimoda, T., Suzuki, S., Matsuzaki, Y., Ishida, T., Akiyama, Y.: Megadock 4.0: an ultra–high-performance protein–protein docking software for heterogeneous supercomputers. Bioinformatics 30(22), 3281–3283 (2014) Loukatou et al. [2014] Loukatou, S., Papageorgiou, L., Fakourelis, P., Filntisi, A., Polychronidou, E., Bassis, I., Megalooikonomou, V., Makałowski, W., Vlachakis, D., Kossida, S.: Molecular dynamics simulations through gpu video games technologies. Journal of molecular biochemistry 3(2), 64 (2014) Mrozek et al. [2014] Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. 
In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Loukatou, S., Papageorgiou, L., Fakourelis, P., Filntisi, A., Polychronidou, E., Bassis, I., Megalooikonomou, V., Makałowski, W., Vlachakis, D., Kossida, S.: Molecular dynamics simulations through gpu video games technologies. Journal of molecular biochemistry 3(2), 64 (2014) Mrozek et al. 
[2014] Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. 
Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. 
[2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. 
arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 
2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. 
https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) 
An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. 
[2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. 
IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  3. De Oliveira Sandes, E.F., Boukerche, A., De Melo, A.C.M.A.: Parallel optimal pairwise biological sequence comparison: Algorithms, platforms, and classification. ACM Comput. Surv. 48(4) (2016). https://doi.org/10.1145/2893488
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ohue, M., Shimoda, T., Suzuki, S., Matsuzaki, Y., Ishida, T., Akiyama, Y.: Megadock 4.0: an ultra–high-performance protein–protein docking software for heterogeneous supercomputers. Bioinformatics 30(22), 3281–3283 (2014) Loukatou et al. [2014] Loukatou, S., Papageorgiou, L., Fakourelis, P., Filntisi, A., Polychronidou, E., Bassis, I., Megalooikonomou, V., Makałowski, W., Vlachakis, D., Kossida, S.: Molecular dynamics simulations through gpu video games technologies. Journal of molecular biochemistry 3(2), 64 (2014) Mrozek et al. [2014] Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. 
In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Loukatou, S., Papageorgiou, L., Fakourelis, P., Filntisi, A., Polychronidou, E., Bassis, I., Megalooikonomou, V., Makałowski, W., Vlachakis, D., Kossida, S.: Molecular dynamics simulations through gpu video games technologies. Journal of molecular biochemistry 3(2), 64 (2014) Mrozek et al. 
[2014] Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. 
Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. 
Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. 
arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 
2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. 
https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) 
An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. 
[2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. 
Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021)
Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021)
Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138
Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917
Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2
Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065
Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017
Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680
Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012
Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222
Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL.
IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  4. Ohue, M., Shimoda, T., Suzuki, S., Matsuzaki, Y., Ishida, T., Akiyama, Y.: Megadock 4.0: an ultra–high-performance protein–protein docking software for heterogeneous supercomputers. Bioinformatics 30(22), 3281–3283 (2014)
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014) Group [2009] Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. 
In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. 
In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. 
Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. 
BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . 
https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. 
[2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. 
IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. 
  5. Loukatou, S., Papageorgiou, L., Fakourelis, P., Filntisi, A., Polychronidou, E., Bassis, I., Megalooikonomou, V., Makałowski, W., Vlachakis, D., Kossida, S.: Molecular dynamics simulations through gpu video games technologies. Journal of molecular biochemistry 3(2), 64 (2014)
arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. 
Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. 
[2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. 
[2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. 
arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. 
BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . 
https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. 
[2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. 
IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
  6. Mrozek, D., Brożek, M., Małysiak-Mrozek, B.: Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. Journal of molecular modeling 20(2), 1–17 (2014)
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. 
Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. 
Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. 
BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . 
https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. 
PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. 
The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. 
Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). 
https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  7. Group, K.: The OpenCL Specification. Version 1.0 (2009). https://www.khronos.org/registry/cl/specs/opencl-1.0.pdf
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591 . https://doi.org/10.1145/3535508.3545591 Christgau and Steinke [2020] Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. 
Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. 
arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070 Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. 
Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. 
[2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410 Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. 
BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . 
https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  8. Jin, Z., Vetter, J.S.: Performance portability study of epistasis detection using sycl on nvidia gpu. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. BCB ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3535508.3545591
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) 
An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. 
BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . 
https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. 
Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. 
In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  9. Christgau, S., Steinke, T.: Porting a Legacy CUDA Stencil Code to oneAPI. In: 2020 IEEE IPDPSW, pp. 359–367 (2020). https://doi.org/10.1109/IPDPSW50202.2020.00070
  Korpar and Sikic [2013] Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410
  Costanzo et al. [2022] Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022)
  De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28:1–28:31 (2016) https://doi.org/10.1145/2858656
  Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4
  Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981)
  Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011)
  Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006)
  Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z
  [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch
  Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857
  Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf
  Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012)
  Codeplay Software [2023] Codeplay Software: ComputeCpp Community Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023)
  Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021)
  [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023)
  Aksel Alpay [2023] Alpay, A.: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023)
  Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005
  Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351
  Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215
  Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6
  NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022)
  Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021)
  Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138
  Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917
  Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2
  Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065
  Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017
  Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680
  Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012
  Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222
  Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010
  Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372
  Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark (2021)
  Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010
  Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692
  Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023)
  Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011)
  Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3
  Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021)
  Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating cuda to oneapi: A smith-waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022) De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. 
[2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. 
Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. 
Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. 
In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  10. Korpar, M., Sikic, M.: SW# - GPU-enabled exact alignments on genome scale. Bioinformatics 29(19), 2494–2495 (2013) https://doi.org/10.1093/bioinformatics/btt410
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. 
Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. 
[2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. 
[2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. 
[2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. 
PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. 
The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. 
Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. 
Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). 
https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  11. Costanzo, M., Rucci, E., García-Sánchez, C., Naiouf, M., Prieto-Matías, M.: Migrating CUDA to oneAPI: A Smith-Waterman case study. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds.) Bioinformatics and Biomedical Engineering, pp. 103–116. Springer, Cham (2022)
      De O. Sandes et al. [2016] De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: MASA: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28:1–28:31 (2016) https://doi.org/10.1145/2858656
      Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4
      Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981)
      Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech (2011)
      Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006)
      Daily [2016] Daily, J.: Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z
      [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch
      Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857
      Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf
      Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: GPGPU processing in CUDA architecture. arXiv preprint arXiv:1202.4347 (2012)
      Codeplay Software [2023] Codeplay Software: ComputeCpp Community Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023)
      Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021)
      [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023)
      Aksel Alpay [2023] Alpay, A.: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023)
      Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipSYCL-based implementation of oneAPI. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005
      Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass SYCL compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351
      Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: OSWALD: OpenCL Smith–Waterman on Altera’s FPGA for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215
      Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences. BMC Systems Biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6
      NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022)
      Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021)
      Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating CUDA codes to oneAPI. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138
      Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying Intel’s oneAPI to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917
      Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on SYCL across CPU and GPU architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2
      Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating CUDA portability with HIPCL and DPCT. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065
      Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of Intel’s DPC++ Compatibility Tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017
      Yong et al.
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656 Needleman and Wunsch [1970] Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. 
Journal of Molecular Biology 48(3), 443–453 (1970) https://doi.org/10.1016/0022-2836(70)90057-4 Smith and Waterman [1981] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981) Hasan and Al-Ars [2011] Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. 
[2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. 
In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  12. De O. Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., De Melo, A.C.M.A.: Masa: A multiplatform architecture for sequence aligners with block pruning. ACM Trans. Parallel Comput. 2(4), 28–12831 (2016) https://doi.org/10.1145/2858656
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. 
https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. 
PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. 
The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. 
The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. 
Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). 
https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). 
https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). 
https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. 
PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. 
The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. 
Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. 
Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). 
https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  14. Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981)
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011) Isaev [2006] Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. 
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. 
Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . 
https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  15. Hasan, L., Al-Ars, Z.: In: Lopes, H., Cruz, L. (eds.) An Overview of Hardware-based Acceleration of Biological Sequence Alignment, pp. 187–202. Intech, ??? (2011)
https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. 
Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. 
In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  16. Isaev, A.: Introduction to Mathematical Methods in Bioinformatics, 1st edn. Universitext. Springer, Heidelberg, Germany (2006) Daily [2016] Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. 
Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Daily, J.: Parasail: Simd c library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17 (2016) https://doi.org/10.1186/s12859-016-0930-z [19] Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. 
https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch Korpar et al. [2016] Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. 
[2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. 
Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. 
BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. 
https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . 
https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . 
https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. 
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
  18. Mneimneh, S.: Computational Biology Lecture 4: Overlap detection, Local Alignment, Space Efficient Needleman-Wunsch
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SWdb: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857 Khoo et al. [2013] Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. 
Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. 
In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf Ghorpade et al. [2012] Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012) Codeplay Software [2023] Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Software: ComputeCpp Comunity Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023) Intel Corp [2021] Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  19. Korpar, M., Sosic, M., Blazeka, D., Sikic, M.: SW#db: GPU-Accelerated Exact Sequence Similarity Database Search. PLOS ONE 10(12), 1–11 (2016) https://doi.org/10.1371/journal.pone.0145857
  20. Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf
  21. Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: GPGPU processing in CUDA architecture. arXiv preprint arXiv:1202.4347 (2012)
  22. Codeplay Software: ComputeCpp Community Edition. https://developer.codeplay.com/products/computecpp/ce/home (2023)
  23. Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021)
  24. The triSYCL project. https://github.com/triSYCL/triSYCL (2023)
  25. Alpay, A.: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023)
  26. Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipSYCL-based implementation of oneAPI. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005
  27. Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass SYCL compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351
  28. Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: OSWALD: OpenCL Smith–Waterman on Altera’s FPGA for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215
  29. Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences. BMC Systems Biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6
  30. NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022)
  31. Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021)
  32. Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating CUDA codes to oneAPI. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138
  33. Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying Intel’s oneAPI to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917
  34. Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on SYCL across CPU and GPU architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2
  35. Jin, Z., Vetter, J.: Evaluating CUDA portability with HIPCL and DPCT. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065
  36. Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of Intel’s DPC++ Compatibility Tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017
  37. Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across GPU, FPGA, and CPU using oneAPI. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680
  38. Marinelli, E., Appuswamy, R.: XJoin: Portable, parallel hash join across diverse XPU architectures with oneAPI. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012
  39. Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in SYCL on an NVIDIA GPU. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222
  40. Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor GPUs. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010
  41. Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating CUDA to SYCL: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372
  42. Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark (2021)
  43. Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary SYCL implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010
  44. Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and SYCL. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692
  45. Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of SYCL compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023)
  46. Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011)
  47. Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous CPU+GPU SoCs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3
  48. Nozal, R., Bosque, J.L.: Exploiting co-execution with oneAPI: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021)
  49. Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. 
In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
  20. Khoo, A.A., Ogrizek-Tomaš, M., Bulović, A., Korpar, M., Gürler, E., Slijepčević, I., Šikić, M., Mihalek, I.: ExoLocator—an online view into genetic makeup of vertebrate proteins. Nucleic Acids Research 42(D1), 879–881 (2013) https://doi.org/10.1093/nar/gkt1164 https://academic.oup.com/nar/article-pdf/42/D1/D879/3609050/gkt1164.pdf
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021) [25] The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 The triSYCL project. https://github.com/triSYCL/triSYCL (2023) Aksel Alpay [2023] Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 
371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). 
https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. 
In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. 
IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  21. Ghorpade, J., Parande, J., Kulkarni, M., Bawaskar, A.: Gpgpu processing in cuda architecture. arXiv preprint arXiv:1202.4347 (2012)
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023) Alpay et al. [2022] Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipsycl-based implementation of oneapi. In: International Workshop on OpenCL. IWOCL’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005 . https://doi.org/10.1145/3529538.3530005 Alpay and Heuveline [2023] Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. 
[2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. 
[2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. 
IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  23. Intel Corp: Intel oneAPI. https://software.intel.com/en-us/oneapi (2021)
  24. The triSYCL project. https://github.com/triSYCL/triSYCL (2023)
  25. Alpay, A.: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023)
  26. Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipSYCL-based implementation of oneAPI. In: International Workshop on OpenCL. IWOCL ’22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005
  27. Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass SYCL compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351
  28. Rucci, E., Garcia, C., Botella, G., De Giusti, A.E., Naiouf, M., Prieto-Matias, M.: OSWALD: OpenCL Smith–Waterman on Altera’s FPGA for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018). https://doi.org/10.1177/1094342016654215
  29. Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences. BMC Systems Biology 12(Suppl 5), 96 (2018). https://doi.org/10.1186/s12918-018-0614-6
  30. NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022)
  31. Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021)
  32. Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating CUDA codes to oneAPI. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138
  33. Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying Intel’s oneAPI to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022). https://doi.org/10.1002/cpe.6917
  34. Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on SYCL across CPU and GPU architectures. J. Supercomput. 79(16), 18480–18506 (2023). https://doi.org/10.1007/s11227-023-05373-2
  35. Jin, Z., Vetter, J.: Evaluating CUDA portability with HIPCL and DPCT. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065
  36. Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of Intel’s DPC++ Compatibility Tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022). https://doi.org/10.1016/j.jpdc.2022.03.017
  37. Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across GPU, FPGA, and CPU using oneAPI. In: International Workshop on OpenCL. IWOCL ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680
  38. Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse XPU architectures with oneAPI. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012
  39. Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in SYCL on an NVIDIA GPU. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222
  40. Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor GPUs. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010
  41. Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating CUDA to SYCL: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372
  42. Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark (2021)
  43. Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary SYCL implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010
  44. Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and SYCL. In: International Workshop on OpenCL. IWOCL ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692
  45. Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of SYCL compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023)
  46. Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011)
  47. Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous CPU+GPU SoCs. The Journal of Supercomputing 77(1), 44–65 (2021). https://doi.org/10.1007/s11227-020-03257-3
  48. Nozal, R., Bosque, J.L.: Exploiting co-execution with oneAPI: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021)
  49. Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022). https://doi.org/10.1002/spe.3002
https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  24. The triSYCL project. https://github.com/triSYCL/triSYCL (2023)
In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351 . https://doi.org/10.1145/3585341.3585351 Rucci et al. [2018a] Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. 
[2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. 
IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  25. Alpay: OpenSYCL implementation. https://github.com/AdaptiveCpp/AdaptiveCpp (2023)
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: Oswald: Opencl smith–waterman on altera’s fpga for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215 Rucci et al. [2018b] Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6 NVIDIA [2022] NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). 
http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. 
In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  26. Alpay, A., Soproni, B., Wünsche, H., Heuveline, V.: Exploring the possibility of a hipSYCL-based implementation of oneAPI. In: International Workshop on OpenCL. IWOCL '22. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3529538.3530005
https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 NVIDIA: Nsight Compute. https://developer.nvidia.com/nsight-compute (2022) Tsai et al. [2021] Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. 
[2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021) Costanzo et al. [2021] Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. 
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138 Martínez et al. [2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
  27. Alpay, A., Heuveline, V.: One pass to bind them: The first single-pass sycl compiler with unified code representation across backends. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585351
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. 
[2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. 
In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. 
Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
  28. Rucci, E., Garcia, C., Botella, G., Giusti, A.E.D., Naiouf, M., Prieto-Matias, M.: OSWALD: OpenCL Smith–Waterman on Altera’s FPGA for large protein databases. The International Journal of High Performance Computing Applications 32(3), 337–350 (2018) https://doi.org/10.1177/1094342016654215
[2022] Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). 
https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. 
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. 
[2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . 
  29. Rucci, E., Garcia, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology 12(Suppl 5), 96 (2018) https://doi.org/10.1186/s12918-018-0614-6
Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022) https://doi.org/10.1002/cpe.6917 https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.6917 Faqir-Rhazoui and García [2023] Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. 
Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2 Jin and Vetter [2021] Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. 
[2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. 
Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065 Castaño et al. [2022] Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. 
  31. Tsai, Y.M., Cojean, T., Anzt, H.: Porting a sparse linear algebra math library to Intel GPUs (2021)
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017 Yong et al. [2021] Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. 
In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. 
In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. 
In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. 
IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
  32. Costanzo, M., Rucci, E., Sanchez, C.G., Naiouf, M.: Early experiences migrating cuda codes to oneapi. In: Short Papers of the 9th Conference on Cloud Computing Conference, Big Data & Emerging Topics, pp. 14–18 (2021). http://sedici.unlp.edu.ar/handle/10915/125138
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. 
Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). 
https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. 
In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. 
Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. 
In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  33. Martínez, P.A., Peccerillo, B., Bartolini, S., García, J.M., Bernabé, G.: Applying intel’s oneapi to a machine learning case study. Concurrency and Computation: Practice and Experience 34(13), 6917 (2022). https://doi.org/10.1002/cpe.6917
https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) 
High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. 
[2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. 
[2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) 
ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. 
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  34. Faqir-Rhazoui, Y., García, C.: Exploring the performance and portability of the k-means algorithm on sycl across cpu and gpu architectures. J. Supercomput. 79(16), 18480–18506 (2023) https://doi.org/10.1007/s11227-023-05373-2
https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680 . 
https://doi.org/10.1145/3456669.3456680 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. 
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012 . 
https://doi.org/10.1145/3465998.3466012 Jin and Vetter [2022] Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). 
https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222 Haseeb et al. [2021] Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. 
In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010 Solis-Vasquez et al. [2023] Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. 
[2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372 . https://doi.org/10.1145/3585341.3585372 Marinelli and Appuswamy [2021] Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. 
In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021) Johnston et al. [2020] Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. 
Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. 
Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  35. Jin, Z., Vetter, J.: Evaluating cuda portability with hipcl and dpct. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 371–376 (2021). https://doi.org/10.1109/IPDPSW52791.2021.00065
  36. Castaño, G., Faqir-Rhazoui, Y., García, C., Prieto-Matías, M.: Evaluation of intel’s dpc++ compatibility tool in heterogeneous computing. Journal of Parallel and Distributed Computing 165, 120–129 (2022) https://doi.org/10.1016/j.jpdc.2022.03.017
  37. Yong, W., Yongfa, Z., Scott, W., Wang, Y., Qing, X., Chen, W.: Developing medical ultrasound imaging application across gpu, fpga, and cpu using oneapi. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456680
  38. Marinelli, E., Appuswamy, R.: Xjoin: Portable, parallel hash join across diverse xpu architectures with oneapi. In: Proceedings of the 17th International Workshop on Data Management on New Hardware. DAMON ’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3465998.3466012
BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary sycl implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. 
[2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. 
Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) 
Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  39. Jin, Z., Vetter, J.S.: Understanding performance portability of bioinformatics applications in sycl on an nvidia gpu. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2190–2195 (2022). https://doi.org/10.1109/BIBM55620.2022.9995222
  40. Haseeb, M., Ding, N., Deslippe, J., Awan, M.: Evaluating performance and portability of a core bioinformatics kernel on multiple vendor gpus. In: 2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 68–78 (2021). https://doi.org/10.1109/P3HPC54578.2021.00010
In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010 Breyer et al. [2021] Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and sycl. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692 . 
https://doi.org/10.1145/3456669.3456692 Shilpage and Wright [2023] Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of sycl compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023) Rognes [2011] Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. 
The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011) Constantinescu et al. [2021] Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous cpu+gpu socs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3 Nozal and Bosque [2021] Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. 
Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Nozal, R., Bosque, J.L.: Exploiting co-execution with oneapi: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021) Marowka [2022] Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002 Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002
  41. Solis-Vasquez, L., Mascarenhas, E., Koch, A.: Experiences migrating cuda to sycl: A molecular docking case study. In: Proceedings of the 2023 International Workshop on OpenCL. IWOCL ’23. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3585341.3585372
  42. Marinelli, E., Appuswamy, R.: OneJoin: Cross-architecture, Scalable Edit Similarity Join for DNA Data Storage Using oneAPI. In: ACM (ed.) ADMS 2021, 12th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, in Conjunction with VLDB 2021, 16 August 2021, Copenhagen, Denmark, Copenhagen (2021)
  43. Johnston, B., Vetter, J.S., Milthorpe, J.: Evaluating the performance and portability of contemporary SYCL implementations. In: 2020 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pp. 45–56 (2020). https://doi.org/10.1109/P3HPC51967.2020.00010
  44. Breyer, M., Daiß, G., Pflüger, D.: Performance-portable distributed k-nearest neighbors using locality-sensitive hashing and SYCL. In: International Workshop on OpenCL. IWOCL’21. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3456669.3456692
  45. Shilpage, W.R., Wright, S.A.: An investigation into the performance and portability of SYCL compiler implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds.) High Performance Computing, pp. 605–619. Springer, Cham (2023)
  46. Rognes, T.: Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinformatics 12:221 (2011)
  47. Constantinescu, D.-A., Navarro, A., Corbera, F., Fernández-Madrigal, J.-A., Asenjo, R.: Efficiency and productivity for decision making on low-power heterogeneous CPU+GPU SoCs. The Journal of Supercomputing 77(1), 44–65 (2021) https://doi.org/10.1007/s11227-020-03257-3
  48. Nozal, R., Bosque, J.L.: Exploiting co-execution with oneAPI: Heterogeneity from a modern perspective. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021: Parallel Processing, pp. 501–516. Springer, Cham (2021)
  49. Marowka, A.: Reformulation of the performance portability metric. Software: Practice and Experience 52(1), 154–171 (2022) https://doi.org/10.1002/spe.3002 https://onlinelibrary.wiley.com/doi/pdf/10.1002/spe.3002